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Abstract 

This paper presents a Bayesian algorithm for linear spectral unmixing of hyperspectral images 
that accounts for anomalies present in the data. The model proposed assumes that the pixel re¬ 
flectances are linear mixtures of unknown endmembers, corrupted by an additional nonlinear term 
modelling anomalies and additive Gaussian noise. A Markov random field is used for anomaly 
detection based on the spatial and spectral structures of the anomalies. This allows outliers to 
be identified in particular regions and wavelengths of the data cube. A Bayesian algorithm is 
proposed to estimate the parameters involved in the model yielding a joint linear unmixing and 
anomaly detection algorithm. Simulations conducted with synthetic and real hyperspectral images 
demonstrate the accuracy of the proposed unmixing and outlier detection strategy for the analysis 
of hyperspectral images. 


Index Terms 

Hyperspectral imagery, unsupervised spectral unmixing, Bayesian estimation, MCMC, anomaly 
detection. 


I. Introduction 

Spectral unmixing (SU) of hyperspectral images (HSI) has been the subject of intensive 
interest over the last two decades. It consists of distinguishing the materials and quantifying 
their proportions in each pixel of an observed image. The SU problem has been widely 
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Studied for applications where pixel reflectances are linear combinations of pure component 
spectra (called endmembers) ||T|, Q. However, as explained in Q, the linear mixing model 
(LMM) can be inappropriate for some hyperspectral images, such as those containing sand¬ 
like materials or where relief is present in the scene. Moreover, LMM-based methods can 
also fail when the data are corrupted by (sparse) outliers, especially when extracting the 
endmembers from the scene. Nonlinear mixing models (NLMMs) have been proposed in 
the hyperspectral image literature and can be divided into two main classes Q, 0. The 
first class of NLMMs consists of physical models based on the nature of the environment 
(e.g., intimate mixtures Q and multiple scattering effects @-0). The second contains more 
flexible models allowing a wider range of nonlinearities to be approximated Q, |10|. 

Here, we consider a general mixing model for spectral unmixing which assumes that the 
observed pixels result from a convex combination of the endmembers of the scene, corrupted 
by an additive term modelling deviations from the classical LMM (e.g., outliers, variability, 
nonlinear effects) and additive Gaussian Noise. The number of endmembers is assumed to 
be known whereas their spectral signatures are unknown. It is interesting to note that many 
nonlinear models in the literature, including polynomial models 0-0 can be expressed in 
a similar manner. Here, the additional terms are assumed to be a-priori independent of the 
endmembers and/or their proportions (abundances), as in pT| , p^ . This class of models 
for robust linear SU allows for general deviations from the LMM to be handled in blind 
source separation methods, i.e., nonlinear effects, outliers or possible endmember variability 


|13|. In [121, spatial and spectral sparsity structures were considered for the additional term 
since deviations from the LMM can occur in specific regions or spectral bands of the HSI. 
This is typically the case when outliers are present, but also when nonlinear effects (relief) 
occurs and when the reflectance of materials present has significant variations in particular 


spectral ranges (e.g. due to natural variability of vegetation). In this paper, we extend [ [T2 | 
by introducing a probabilistic 3D Ising model for spatial and spectral influence of outliers 
thus allowing for more flexible group-sparsity structures for the support sets of anomalies. 


whereas [12| assumed the support sets of outliers to have a fixed structure. Moreover, the 
algorithm presented in this paper allows the estimation of the Ising model parameters directly 
from the data. 

Adopting a Bayesian framework, we assign prior distributions to the unknown model 
parameters to include available information (such as parameter constraints) within the es¬ 
timation procedure. In particular, an Ising Markov random field is introduced to model 
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spatial and spectral correlations for the anomalies. The joint posterior distribution of the 
unknown parameter vector is then derived. Since classical Bayesian estimators cannot be 
easily computed from this joint posterior, a Markov chain Monte Carlo (MCMC) method is 
used to generate samples according to this posterior. More precisely, we construct an efficient 


stochastic gradient MCMC (SGMCMC) algorithm p4| that simultaneously estimates the 
endmember and abundance matrices along with the Ising hyperparameters. 

The main contributions of this work are threefold: 

1) We develop a new hierarchical outlier model taking into account spatial and spectral 
correlations through Markovian dependencies, this contrasts with the model proposed 


in IT^ which considered a fixed outlier structure. This flexible model is embedded 
within the mixing model for robust unsupervised linear spectral unmixing via anomaly 
detection. 

2) An adaptive MCMC algorithm is proposed to compute the Bayesian estimates of 
interest and perform Bayesian inference. This algorithm is equipped with a stochastic 
optimisation adaptation mechanism that automatically adjusts the parameters of the 
Markov random field by maximum marginal likelihood estimation, thus removing the 
need to set the regularisation parameters by cross-validation. 

3) We show the benefits of the proposed flexible model for linear spectral unmixing of 
synthetic and real hyperspectral images. Specifically, we demonstrate the ability of 
the proposed algorithm to detect structured anomalies thus enhancing endmember and 
abundance estimation. 

The remaining sections of the paper are organized as follows. Section |I^ introduces the 


mixing model for robust linear SU of HSIs, followed by Section III which summarizes the 
likelihood and the priors assigned to the unknown parameters of the model. The resulting 
joint posterior distribution and the Gibbs sampler used to sample from it are summarized in 


Section IV A generalization of the proposed Bayesian model for robust Bayesian subspace 
identification is proposed in Section |Vj Some simulation results conducted on synthetic data 
are shown and discussed in Section |V^ Conclusions and future work are reported in Section 
lYml 

II. Problem formulation 

We consider a set of N observed pixels/spectra y„ = ■ ■ ■, 2/L,n]^, n G {1,..., A^} 

where L is the number of spectral bands. Each of these spectra is assumed to result from a 
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linear combination of R unknown endmembers m^, corrupted by possible additive outliers 
and Gaussian noise. The observation model can be expressed as 

R 

Yn ^ ^ T 

r=l 

= Ma„ + r„ + e„, n = l,...,iV (1) 

where mr = , mr,L]^ is the spectrum of the rth material present in the scene 

and ttr^n is its corresponding proportion (abundance) in the nth pixel. In ([T]), e„ is an 
additive independently but non identically distributed zero-mean Gaussian noise sequence 
with diagonal covariance matrix Sq = diag(cr^), denoted as e„ ~ (e„; Ol, Sq), where 

cr^ = [cr^,..., ajy' is the vector of the L noise variances and diag(cr^) is an L x L diagonal 
matrix containing the elements of the vector cr^. Moreover, r„ denotes the outlier vector 
of the nth pixel. Note that the usual matrix and vector notations M = [mi,..., and 
o-n = [ai,n, • • •, ctR,n]^ havc bccn used in the second row of ([T]). 

As a consequence of physical constraints, the abundance vectors satisfy the following 
positivity and sum-to-one constraints 

R 

= > 0,Vr e {1,... . (2) 

r=l 

The problem investigated in this paper is to estimate the endmember matrix M, the 
abundance matrix A = [ai,..., aAr], the noise variances in cr^ and the outlier matrix 
R = [ri...,rAr] from the observation matrix Y = [yi,..., yiv]. To solve this problem, 
we propose a hierarchical Bayesian model and a sampling method to estimate the unknown 
parameters. 

III. Robust Bayesian Linear Unmixing (RBLU) 

A. Likelihood 

Eq. ([T]) implies that yn|M, a„, r„, cr'^ M (y„; Ma„ + r„, Sq). Assuming independence 
between noise sequences of the N observed pixels, the likelihood of the observation matrix 
Y can be expressed as 
/(Y|M,A,R,<t2)oc 



(Y - MA - R)^So ^(Y - MA - R)' 

1 0 1 t! Li 

2 


where oc means “proportional to” and etr(-) denotes the exponential trace. 
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B. Parameter priors 

1) Prior for the abundance matrix A; Each abundance vector can be written as a„ = 
[c^, with Cn = [ai,n,..., aR_i^nY OR^n = «r,n- The EMM eonstraints @ 

impose that c„ belongs to the simplex S = |c |c, > 0, Vr e 1,..., i? - 1, Ef=“i' c. < l} . To 
refleet the laek of prior knowledge about the abundanees, a uniform prior is assigned for eaeh 
veetor c„, n e {1,..., A}, i.e., /(c„) cx Is (c„), where I5 (■) is the indieator funetion defined 
on the simplex S. When prior knowledge about the abundanees are available, the uniform 
prior ean be replaeed by more informative priors sueh as (mixtures of) Diriehlet distributions 


1151 or Gaussian mixtures using logistie eoeffieients [16|. Assuming prior independenee 
between the N abundance vectors at leads to the following joint prior distribution 


N 


/(c) =n/(>=")• 


(4) 


71 = 1 


where C = [ci,..., cjv] is an {R — 1) x N matrix. 

2) Prior for the endmember matrix M.- To refleet the laek of prior knowledge about the 
endmembers, we use the following multivariate truneated Gaussian prior 

R 

/(M|OocnA/'(R+)-(mr.;0,eiL) (5) 


r=l 


where ^ is fixed to a large value, to ensure endmember positivity while using a weakly 
informative prior. Note that @ is eonsidered in order to handle the ease where the data are 
not normalized. If the data are aetual refleetanee values, a prior ean be introdueed ensuring 


that the endmember speetra belong to (0,1), sueh as a uniform, beta [17| or Gaussian 
distribution. Note that the prior ean also inelude prior information from an endmember 


extraetion algorithm, as in p8| |, p9|. 

3) Prior for the noise variances: A Jeffreys’ prior is chosen for the noise varianee in 
eaeh speetral band a'j, i.e., /(o-|) cx (cr|) where 1®+ (■) denotes the indieator funetion 

defined on M+, whieh re fleets the absenee of knowledge about these parameters. Again, these 
non-informative priors can be easily replaced by conjugate inverse-Gamma priors to include 
prior knowledge available about the noise levels. Assuming prior independenee between the 
noise varianees, we obtain 

L 

= ( 6 ) 


i=i 
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4) Priors of the outliers: As in [10|, [12|, the outliers are assumed to be sparse, i.e., at 


most of the pixels and speetral bands, the outliers are expeeted to be exaetly equal to zero. 
To model the outlier sparsity, we faetorize the outlier matrix as 


R = Z ©X, 


(7) 


where Z G {0,1}^^^ is a label matrix, X G and 0 denotes the Hadamard (termwise) 

produet. This deeomposition allows one to deeouple the loeation of the sparse eomponents 
from their values. More preeisely, zi^n = = 1 if an outlier is present in the fth speetral 

band of the nth observed pixel with value equal to ^ A conjugate Gaussian prior 

is used for X, i.e.. 


/(X|s2) = s^) , 


( 8 ) 


£,n 


where controls the prior energy of the outliers. Note that @ allows the outliers to be 
negative. Other conjugate priors, such as exponential or truncated Gaussian priors, could be 
used instead of @, e.g., to enforce outlier positivity. The next paragraph presents the prior 
considered for the label matrix Z. 

5) Label matrix: For many applications, the locations of outliers are likely to be spec¬ 
trally (e.g., water absorption bands) and/or spatially (e.g. weakly represented components, 
shadowing effects) correlated. An effective way to take correlated outliers/nonlinear effects 
into account is to consider Markov random fields (MRF) to build a prior for the label matrix 


Z [10|. MRFs assume that the distribution of a label ze^n conditionally to the other labels 


of the image equals the distribution of this label vector conditionally to its neighbors, i.e., 
P(^£,„|Z\^^„) = P(zi^n\'Zve,J^ where is the index set of the neighbors of ze,n, 
denotes the matrix Z whose element ze^n has been removed and Zy^,,, is the subset of 
Z composed of the elements whose indexes belong to Vi^n- In this study, we consider 
that the spatial and spectral correlations can be different and thus consider two different 
neighborhoods. We decompose the neighborhood as Ve^n = U where (resp. 

denotes the spatial (resp. spectral) neighborhood of Zi^n- In this paper, we consider an 
Ising model that can be expressed as 

1 


P(Z|/3') = -^exp[/3^0(Z) + 0o(Z,/3o)] 


(9) 
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where (3 = [I^n.I^lY, (3' = [f3'^,l3of and 

0L (z) = J2n,e S{ze^n — Z£'^n), 

^ 4>N (Z) = Y2n,i — Ze^n'), 

0(Z) = [0i(Z),0^(Z)f, 

00 (Z, 0o) = 00 Sn,£(l “ + (1 “ 0o) 'l2n,e 

and ^(■) denotes the Kroneeker delta funetion. Moreover, I3n > 0 and 0 l > 0 are hyperpa¬ 
rameters that eontrol the spatial and speetral granularity of the MRF and 0 < 0o < 1 is an 
additional parameter that models the probability of having outliers in the image. Speeifieally, 
the higher the value of 0o> the lower the probability of outliers in the data. The estimation 
of the proposed Ising model hyperparameters will be diseussed in the next seetion. Different 
speetral and spatial neighbourhoods ean be used in Q. In this paper, we eonsider a 4- 
neighbour strueture to aeeount for the spatial eorrelation and a 2-neighbour strueture for the 
speetral dimension. 

C. Outlier variance 

The following eonjugate inverse-Gamma prior is assigned to 

s2~X^(7,z/), (10) 

where (7, a) are fixed to (7, u) = (10“^, 10“^) to ensure a weakly informative prior. This 
ehoiee of hyperparameters refleets the laek of prior information about the outliers varianee. 
However, eould be replaeed by more informative priors by suitably adapting (7, u ). 

D. Joint posterior distribution 

Assuming the parameters M, A, Z, X and cr^ are a priori independent, the joint posterior 
of the parameter veetor 6 = {M, A, X, Z, cr^} and the parameter ean be expressed as 

/(0, Y, 0, (3') oc /(Y|0)/(0|5^ e, /3 )/(s"|7, (11) 

where 

/(0|s^ e, /3') = /(M|0/(A)/(^2 )^(x|s2)P(Z|/3') (12) 

and 0 = [^,7,1^]^ a veetor of model hyperparameters. The MRF parameter veetor (3' will 
be determined by maximum marginal likelihood estimation during the inferenee proeedure. 
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The directed acyclic graph (DAG) summarizing the structure of proposed Bayesian model is 
depicted in Fig. 

Next we describe an MCMC method for sampling from this posterior distribution to 
estimate the unknown model parameters. 

IV. Bayesian Inference 


The Bayesian model defined in Section specifies the joint posterior density for the 
unknown parameters 0, given the observations Y and the hyperparameters 7, and f3'. 
This posterior distribution models the complete knowledge about the unknown parameters 
given the observed data and the prior information available. We propose the following 
Bayesian estimators for hyperspectral unmixing and nonlinearity detection: the marginal 
posterior mean or minimum mean square error estimator of the abundance and endmember 
matrices 


^MMSE, MmMSE 


= E 


A,M 


Y,0,/3 


(13) 


where the expectation is taken with respect to the marginal posterior density /(A, M| Y, 0, f3 ) 
(by marginalizing out Z,X,cr^ and this density takes into account their uncertainty), the 
marginal maximum a posteriori (MMAP) estimator for the outlier support Z 


^ argmax 0, ^3 ), 


(14) 


and, conditionally on the estimated outliers location, the minimum mean square error esti¬ 
mator of the outlier values 


^MMSE 

nX 


= E 




r ^ 'W rh h 


(15) 
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where 


f{znAY,4^,^) = j /(0,s'|Y,0,^')d0W,.ds^ (16) 

where E [■] denotes the expeetation with respeet to the eonditional marginal density 

(17) 

/(^n,f|Y>,^') 

Note that the outlier estimator ( [TS] ) is sparse by eonstruetion {i.e., E 

0 ). 

Computing ( [T3] ), ( [T?] ) and ( [T5] ) is ehallenging beeause it requires aeeess to the joint 
marginal density of (M, A), the univariate marginal densities of Zn^g and the joint marginal 
densities of Zn,t)^ whieh in turn require eomputing the posterior ( [TT] ) and performing 
an integration over a very high-dimensional spaee. Eortunately, these ean be effieiently 
approximated with arbitrarily good aeeuraey by Monte Carlo integration. More preeisely, 
it is possible to eompute ( fT3| ), ( [T?] ) and ( [T5] ) by first using an MCMC eomputational method 
to generate samples asymptotieally distributed aeeording to ( fTTj ), and subsequently using 
these samples to approximate the required marginal probabilities and expeetations. Note that 
in ( [T3] ), ( pA] ) and ( [T5| ), we have set (3' = ^ , whieh denotes the maximum marginal likelihood 
estimator of the Ising regularisation hyperparameter veetor (3' given the observed data Y, 
i.e., 

P = argmax/ (Y|0, (3 '). (19) 

P'&B 

This is an empirieal Bayes approaeh for speeifying f3' where hyperparameters with unknown 
values are replaeed by point estimates eomputed from observed data (as opposed to being 
fixed a priori or integrated out of the model by marginalisation). As explained in [ [T4| , 
this strategy has several important advantages for MRE hyperparameters sueh as /3' having 
intraetable eonditional distributions. In partieular, it allows automatie adjustment of the 
value of (3' to eaeh image, produeing signifieantly better estimation results than using a 
single fixed value of f3' for all images. Eurthermore it has signifieantly lower eomputational 
eost eompared to that of eompeting approaehes, sueh as ineluding [3' in the model and 
subsequently marginalizing it during the inferenee proeedure p0|. 


X. 


n,£ 


^n,e 


= 0, Y,0,^ 
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A. Bayesian estimation algorithm 

Exact computation of the MMSE and MMAP estimators ( [T3| ), ( [T?] ) and is very 
ehallenging beeause it involves ealeulating expeetations with respeet to posterior marginal 
densities, whieh in turn require evaluating the full posterior ( [TT] ) and integrating it over a 

-V / 

very high-dimensional spaee. Exaet eomputation of j3 is also diffieult beeause it involves 
solving an intraetable optimisation problem (it is not possible to evaluate the exaet marginal 
likelihood f{Y\as) or its gradient V/(Y|a 3 )). Here we follow the approaeh proposed in 
10 and design a stoehastie optimisation and simulation algorithm to eompute ( [T3l ), ( [T4l ) and 
( [T5] ) simultaneously. We eonstruet an SGMCMC algorithm that simultaneously estimates ^ 
and generates a ehain of Nmc samples asymptotieally distributed aeeording 

to the marginal density /(M, A|Y, ^ ) (this algorithm is summarised in Algorithm ^below). 
Onee the samples have been generated, the estimators ( fT3| ), ( fT4| ) and ( fTS] ) are approximated 
by Monte Carlo integration [|^ Chap. 10], i.e., 

1 


M 


MMSE 


^MMSE 


MMAP 


N, 


MC 




bi 


^MC 

- ^ mw, 

t=A^bi + l 


N, 


MC 




^y aw, 

-Ybi ^ 

0 if cai:d{Zn,e) < {Nmc - Ym) /2 
1 else, 


MMSE 


eard(E:„,£) 

0 


else, 


MMAP 


= 1 


with Zn,i = e {Ybi + 1 ,..., Ymc}, ^ where the samples from the first 

Ybi iterations (eorresponding to the transient regime or bum-in period) are disearded. The 
main steps of this algorithm are detailed in below. 

1) Sampling the labels: It ean be seen from ( [TT] ) that 

f{zn,e = ^|Y, s^) oc V(n, i), 


( 20 ) 


where i G {0,1} and 


log ( 






2aj 


0^ct>{Z) 


[Z,/3o). 


( 21 ) 
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Consequently, the label Zn/ can be drawn from its eonditional distribution by drawing 
randomly from {0,1} with probabilities given by 


f{zn,e = i|Y, /3', s^) = 

2) Sampling the endmembers: It ean be easily shown that 

L 

/(M| Y, 0, (3') = n /(m,,: IY, . 0, 

i=i 


( 22 ) 


(23) 


i.e., the rows of M, denoted as {m^.} are a posteriori independent (eonditioned on the other 
parameters). Moreover, 






:(M) 


(24) 


where 


ni£,: = (T^ 

= {afAA^ + i-^\LY" (25) 


and . is the i\h row of the Lx N matrix Y = Y — R. Sampling from (24) ean be aehieved 


effieiently by using the Hamiltonian method reeently proposed in [22| or by sueeessive 
sampling from R truneated Gaussian distributions (via Gibbs sampling). 


3) Sampling the abundances: In a similar fashion to obtaining ( [23| ), it ean be easily shown 
that 


N 

/(C|Y, 6\c, 0, (3') = n /(CnlYn, 0\c„, s'), (26) 

n=l 

i.e., the eolumns of A are a posteriori independent (eonditioned on the other parameters). 
Moreover, the eonditional distribution of c„|y„, 0\c„, is a multivariate Gaussian distribution 
restrieted to the simplex S, whieh ean be sampled effieiently using the method proposed 
in[|22|. 


4) Sampling the latent variable matrix X.- In a similar manner to the abundanee matrix in 
( |26l ), the elements of X are a posteriori independent (eonditioned on the other parameters) and 
ean be sampled independently. Sinee the prior ([^ is eonjugate to the Gaussian distribution, 
the full eonditional distribution of is given by 




(27) 
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where 


cr: 


^t,n ^£,n(jj£,n ^^£,-.^n) 2 


(Tt 


2„2 


a?s 


^£,n — 


(28) 


(yj + 

5) Sampling the noise variances: Sampling the noise varianees ean be easily aehieved by 
sampling from the following L independent inverse-Gamma distributions 






2.^ ||y£,: - mf,;A - 
2 


(29) 


6) Sampling the outlier variance s^; Finally, in a similar fashion to the noise varianees, 
it ean be shown that 


Y, 61,0 ~ ig 



(30) 


2 ” ^2 

n,£ 

7) Updating the Ising regularisation model parameter vector (3': If the marginal likelihood 
/(Y|0, /3') was traetable, we eould update jS' from one MCMC iteration to the next by using 
a elassie gradient deseent step 

/3'(*) = + 5tVlog/(Y|0,/3'O-i)), 

with 5t = sueh that /3'0) converges to ^9 as t ^ oo. However, this gradient has two 

levels of intraetability, one due to the marginalisation of {0,s'^) and another one due to the 
intraetable normalising eonstant of the Ising model. We address this diffieulty by following 


the approaeh proposed in [14|; that is, by replaeing Vlog/(Y|0,/3'O)) with an estimator 
eomputed with the samples generated by the MCMC algorithm at iteration t, and a set of 
two auxiliary variables (Z') ~ /C(Z|Z(9^generated with an MCMC kernel K. with 
target density Q (in our experiments we used a Gibbs sampler implemented using a eolouring 
seheme sueh that half of the elements of Z' are generated in parallel). The updated value 
is then projeeted onto the domain Bt = [0,Bt] x [0,Bt] x [0,1] to guarantee the eonstraints 
of f3N,(3L and and the stability of the stoehastie optimisation algorithm, where Bt is an 
arbitrarily large upper bound on /Sat and that ean be inereased at every iteration. In our 
experiments we have used Bt = 10. 

It is worth mentioning that if it was possible to simulate the auxiliary variables Z' exaetly 
from @ then the estimator of V log/(Y|0,used in Algorithmwould be unbiased 
and as a result would eonverge exaetly to ^ . However, exaet simulation from Q is 
not eomputationally feasible and therefore we resort to the MCMC kernel K, and obtain a 
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biased estimator of Vlog/(Y|0,/3'^* that drives to a neighbourhood of ^ [14|. 
We found that eomputing this is signifieantly less expensive than alternative approaehes, e.g., 


using an approximate Bayesian eomputation algorithm [20|, and that it leads to very aeeurate 
unmixing results. 

Algorithm 1 


RBLU algorithm 


1: Fixed input parameters: Number of endmembers R, number of burn-in iterations TVbi, total number of 
iterations N^c 
2: Initialization (t = 0) 

. Set , M(o) , , (t2(o) , 

3: Iterations (1 < f < Amc) 


4: Sample Z^^'> from 
5: Sample from ( |24l l 
6: Sample A*^*l from ( |26| l 
7: Sample from ^n\ 

8: Sample from ( |29l l 
9: Sample from 
10: if f < TVbi then 

11: Sample Z' - /C(Z|ZW,/3'(‘"^)) 

Set with 


12 : 

13: 

14: 

15: 

16: 

17: 


A/Itv = 

Set = iP| 


A/3l = 5t 


^\ogm0)-^\ogf{Z'\0) 
[o.St] + A/3l) with 


^Jogf{Z\0) - ^0ogf{Z'\0] 


Set 


= ^[0,1] {Po + A/3o) with 


d/3o 


A/3o = 

18: else 
19: Set 

20: end if 

21: Set f = f-f 1. 


log/(Z|/3')-5|-log/(Z'|^') 


Note that in Algorithm 'P[o,Bt](-) denotes the projeetion onto [0,i?t]. In this paper, 
we propose an MCMC method that sequentially updates Z, M, A, X, cr^ and at eaeh 
sampler iteration. However, some of these variables, sueh as (Z, X),(M, X) or (A,X) eould 
be updated simultaneously or eould be re-sampled within a given sampler iteration, thus 


improving the mixing properties of the proposed sampling seheme, e.f. [12|. 
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V. Generalization to Robust Bayesian Principal Component Analysis 


As mentioned previously, one of the main eontributions of this paper is the introduetion 
of a 3D Ising model to model the possible eorrelation between the outlier loeations. This 


outlier model has been applied to linear SU in Section III However, this model can be 
applied to many other blind source separation (subspace identification) and inverse problems 
(nonlinear spectral unmixing) where the observations are corrupted by additive Gaussian noise 
and outliers. In this section, we discuss the generalization of the robust Bayesian principal 


component analysis model studied in p^ , p3| . As in p3| , we consider the following 
observation model (expressed in matrix form) 


Y = L + R + E 


(31) 


where R (resp. E) are L x N matrices representing the outliers (resp. the additive Gaussian 
noise) and L = DAW^ corresponds to a low rank matrix. The L x R matrices D and W 
are matrices of the left- and right-singular vectors, respectively, and A = diag ([Ai,..., Ar]) 
is a diagonal matrix consisting of the singular values {Ar.}r=i,Following the model 


considered in [12|, [231, we can set A^ = CrVr where Cr G {0,1}, which allows estimation 
of the dimension of the principal subspace. Since the columns of E are i.i.d., i.e.. 


Ar(e„;Oz,,So), Vn, 


(32) 


we can assign appropriate prior distributions to cr^,D,W,{Cr} and {T]r}, and consider 
the Bayesian model proposed in Section III for the outlier matrix R (i.e., Eqs. 0-([T^). 
The resulting robust Bayesian principal component analysis model can then be used to 
estimate the data principal subspace in the presence of non-i.i.d. additive Gaussian noise 
and possibly correlated outliers. As the posterior distribution of the anomalous outliers can 
also be estimated, the model also allows the outlier matrix to be estimated. The sampling 


strategy associated with Bayesian model is similar to those proposed in [12|, [231 and that 


presented in Section IV and is not further developed here due to space constraints. 

VI. Simulations using synthetic data 


A. Linear mixtures corrupted by outliers 

The performance of the proposed method, referred to as “RBLU” (Robust Bayesian linear 
unmixing), is investigated on two synthetic 60 x 60 pixel hyperspectral images composed 
of i? = 3 endmembers and observed at L = 207 different spectral bands (see Fig. [^. 
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The abundances of the two images are uniformly distributed in the simplex, defined by 
the positivity and sum-to-one constraints, and the noise variances set to aj = 
corresponding to an average SNR of 30dB without any anomaly addition. The first image 
Ji does not contain outliers whereas the parameter controlling the outlier power has been 
set to = 0.1 for the second image I 2 . The label matrix of I 2 has been generated using 
(|^ with (3 = [0.25; 0.25; 0.55]^ which leads to approximately 10% of actual outliers in R. 
The proposed method has been applied to the images with Nuc = 1000 iterations (including 


Nhi = 300). The endmember matrix was initialized using VCA [24| and the abundance matrix 
was initialized using FCLS ||T|. The combination VCA-FCLS is also used for performance 
comparison. 

The quality of the unmixing procedures can be measured by comparing the estimated and 
actual abundance vector using the root normalized mean square error (RNMSE) defined by 


RNMSE = 


\ 


1 ^ 
-y ll«r 


NR 




(33) 


n=l 


where a„ and a„ are the actual and estimated abundance vectors for the nth pixel of the 
image. The quality of endmember estimation is evaluated by the spectral angle mapper (SAM) 
defined as 

(m^, nir) 


SAM = arccos 


(34) 


|m.r|| ||mr 

where is the rth actual endmember and 111 ^ its estimate. The smaller the value of |SAM|, 
the closer the estimated endmembers to their actual values. 

Table |I] compares the performance of the proposed method and the VCA-ECES unmixing 
strategy and shows that the proposed methods outperforms VCA-ECES in terms of abundance 
and endmember estimation. Moreover, the confusion matrix of the proposed outlier detection 
method in Table illustrates the ability of the method to identify the corrupted data. 


Table III compares the abundance estimation performance of RBEU to o-ECES (which 
assumes perfectly known endmembers) for different outlier corruption scenarios (proportions 
and variances) and two noise settings, = 10“^ and = 10“^ (which correspond to SNR 
of approximately 30dB and 20dB, respectively, when considering data without anomalies). 
This table shows a general performance degradation of the algorithms when the number of 
outliers increases. However, although RBEU also estimates the endmembers (jointly with 
the abundances), the performance degradation is less severe for RBEU than for o-ECES by 
virtue of RBEU’s outlier detection ability. It is interesting to note that RBEU is also less 
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RBLU 

VCA-FCLS 

o-ECLS 



mi 

0.21 

0.68 

- 

h 

SAM (xl0“2) 

m 2 

0.17 

0.92 

- 


m 3 

0.26 

1.96 

- 


RNMSE (xlO 

■") 

0.68 

1.60 

0.67 



mi 

0.19 

4.03 

- 

h 

SAM (xl0“^) 

m 2 

0.20 

3.08 

- 


m 3 

0.29 

4.26 

- 


RNMSE (xlO 

■") 

0.74 

8.27 

6.95 


TABLE I 

Estimation performance. 



2 = 0 

2 = 1 

Total 

2 = 0 

652724 

7190 

659914 

2 = 1 

789 

84497 

85286 

Total 

653513 

91687 

745200 


TABLE II 

Outlier detection ih )'. confusion matrix. 


Noise 

Outlier 

= 

= 0.01 

= 

= 0.1 

variance 

fraction 

RBLU 

o-FCLS 

RBLU 

o-FCLS 


10 % 

0.77 

2.00 

0.74 

6.95 

<72 = 10 -" 

20 % 

0.80 

3.06 

0.77 

8.89 


30% 

0.86 

3.66 

0.87 

10.34 


10 % 

2.19 

2.77 

2.21 

6.00 

( 7 ^ = 10 "^ 

20 % 

2.36 

3.65 

2.39 

8.90 


30% 

2.59 

4.09 

2.62 

10.57 


TABLE III 

Abundance RNMSE (x 10“^) for different outlier energies and proportions. 


sensitive than o-FCLS to variations in the outlier varianee (o-FCLS abundanee estimation 
performanee deereases when the outlier varianee inereases). In addition, as the proportion of 
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Construction concrete 





Fig. 2. Actual endmembers (red lines) used to generate the synthetic images and endmembers estimated by VCA (black 
lines) and RBLU (dashed blue lines) for l 2 - 


outliers increases, sparsity ceases to be a reliable discriminant and it becomes increasingly 
difficult to detect the outlier samples. Similarly, if the variances of the noise and the anomalies 
are similar RBLU will not be able to detect the potential anomalies, confounding the outliers 
with a fictitious Gaussian noise having larger variance. 

For a 64-bit Matlab R2014b implementation on a 3GHz Intel Xeon quad-core workstation, 
RBLU required 30min to analyse each image composed of 60 x 60 pixels (3s for VCA-FCLS). 
Although RBLU can provide significantly improved endmember and abundance estimates, it 
has a higher computational cost than VCA-FCLS. 

Next we discuss the unmixing problem when both linearly and nonlinearly mixed pixels 
are present in the image. 
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B. Unmixing of linear and nonlinear mixtures 

As mentioned previously, many nonlinear models of the speetral unmixing literature ean be 
expressed as ffl. This is the ease, for instanee, for polynomial models [[^-[[^ introdueed to 
handle multiple scattering effects. The proposed model ([T]) seems particularly well adapted to 
detecting nonlinear effects that are often spatially localized, e.g., regions with significant relief 
variations, and/or spectrally concentrated such as nonlinear interactions (if present) varying 
smoothly over wavelengths. To evaluate the performance of the RBLU algorithm in terms of 
endmember estimation, abundance estimation and nonlinearity detection, we synthesised a 
60 X 60 pixel image composed of the three endmembers considered in the previous paragraph. 
Most of the pixels (75%) were generated according to the classical LMM while the remaining 
25% were generated according to a bilinear model, namely, the generalized bilinear model 

0 


y„ ^ Ma„ 

R-1 R 

+ EE 

2=1 


(35) 


where 0 < < 1 characterizes the level of interaction between the endmembers rrij and 

uij in the nth pixel. This choice of nonlinear model is motivated by the fact the polynomial 
(and in particular bilinear) models, introduced to account for multiple scattering effects, have 
demonstrated improved performance for certain types of urban and vegetated areas 
The abundances of each pixel (linearly or nonlinearly) mixed were uniformly drawn from 
the simplex defined by the abundance positivity and sum-to-one constraints. The nonlinearity 


parameters in (35) were fixed to = 1, which corresponds to the choice made in Fan’s 


model [251. All pixels have been corrupted by a zero-mean additive Gaussian noise i.i.d, with 
variance cx^, corresponding to an average SNR of 28dB. Note that although the generated 
abundance vectors are spatially independent, the positions of the generated anomalies are 
spatially correlated due to the spatial organization of the linearly and nonlinearly mixed 
pixels (see Fig. (left)). The RBLU algorithm was applied to the data using Amc = 2000 
iterations (including Abi = 500). 

Table [TVl compares the estimation performance of RBLU to the results obtained with 1) 
BLU [[T§, 2) VCA-tFCLS Q, 3) NfindR -t FCLS [[T|, @, 4)o-FCLS Q, 5) NfindR 
followed by the gradient-based inversion step proposed in [[^ based on the GBM ( [35] ) 


(referred to as “NfindR -i- GBM” in the table), 6) VGA [24| followed by the gradient-based 
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inversion step proposed in Q based on the GBM ( [35] ) (referred to as “VGA + GBM” in the 
table), and 7) the GBM method [[^ applied to the data assuming the endmembers are known 
(referred to as “o-GBM” in the table for oraele-GBM). Due to its anomaly deteetion ability, 
RBLU generally provides better abundanee estimates than LMM-based methods. Although 
RBLU does not rely on the GBM, its provides better abundanee estimates than BLU, in 
partieular for nonlinearly mixed pixels. For this data set eontaining pure pixels, N-FindR and 
VGA ean provide better endmember estimates than RBLU, whieh would be different in the 
absenee of pure pixels. Indeed, in a similar manner to BLU, the proposed RBLU algorithm 
does not require the pure pixel assumption. However, due to spaee eonstraints, simulations 
eondueted on synthetie data whieh do not eontain pure pixels are not presented in this paper 
whieh foeuses on the anomaly deteetion eapability of RBLU (the interested reader is invited 


to eonsult [271 for additional results on data whieh do not eontain pure pixels). We diseuss 
this point in the next seetion in the eontext of the RBLU analysis of real hyperspeetral images. 



RNMSE 

SAM (xlO 



(xlO"^) 

mi 

m 2 

m 3 

o-FCLS 

4.44 

- 

- 

- 

VCA+FCLS 

4.46 

0.35 

0.55 

0.90 

NfindR + FCLS 

5.96 

0.65 

3.09 

0.80 

o-GBM 

2.28 

- 

- 

- 

VCA-l-GBM 

2.15 

0.35 

0.55 

0.90 

NfindR - 1 - GBM 

2.26 

0.65 

3.09 

0.80 

BLU 

4.03 

1.76 

1.94 

1.33 

RBLU 

3.81 

2.79 

3.56 

0.59 


TABLE IV 

Abundance and endmember estimation eor the synthetic image composed of linearly and 

NONLINEARLY MIXED PIXELS. 


Fig. [^(middle and right) shows the aetual and estimated outlier energy in eaeh pixel (i.e., 
||f„|| 2 ) and illustrates the ability of RBLU to deteet bilinear mixtures in an image eontaining 
both linear and non-linear mixtures. 

Finally, Table |V] eompares the exeeution time required by the different methods to analyse 
the 3600 pixels of the image eomposed of linearly and nonlinearly mixed pixels on the same 
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Fig. 3. Left: Actual location of the linearly (black pixels) and nonlinearly (white pixels) mixed pixels. Actual (middle) and 
estimated (right) anomaly energy in each pixel of the synthetic image composed of linearly and nonlinearly mixed pixels. 


o-FCLS 

1 

VCA+FCLS 

3 

NfindR + FCLS 

3 

o-GBM 

350 

VCA+GBM 

354 

NfindR + GBM 

358 

BLU 

583 

RBLU 

1526 


TABLE V 

Computational time (in seconds) eor the synthetic image composed of linearly and nonlinearly 

MIXED pixels. 


basis as before. Although the GBM-based method proposed [[^ takes longer than FCLS, it 
processes the pixels independently and successively. Its run-time could thus be improved {e.g., 
using parallelization) and its computational complexity could approach that of FCLS. BLU 
and RBLU are unsupervised unmixing algorithms, which jointly estimate the endmembers and 
abundances and are based on MCMC methods, which are more computationally demanding. 
Although some sampling steps can be performed in parallel, the underlying Gibbs samplers 
are intrinsically sequential processes that require a sufficient number of iterations to explore 
the posterior distribution of interest. RBLU is more computationally demanding than BLU as 
it includes additional sampling steps (labels and anomaly values) and the estimation of the 
Ising model regularization parameters. However, the results presented in this paper illustrate 
the benefits of the (structured) anomaly detection ability of RBLU on the endmember and 
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abundance estimation performanee. 

VII. Simulations using real hyperspectral data 
A. Moffett data set 

The first real image considered in this section is composed of L = 189 speetral bands 
and was acquired in 1997 by the AVIRIS satellite. The aequired image eovers a region over 
Moffett Field, CA. A subimage of size 50 x 50 pixels has been ehosen here to evaluate 
the performanee of RBLU. The AVIRIS Moffet Field dataset has been previously used for 
eomparing methods of linear Q, [18|, |28|-[30| and nonlinear Q unmixing. The subimage 
of interest is mainly eomposed of water, vegetation, and soil. As in in the previous subseetion, 
the RBLU algorithm was applied with Amc = 2000 iterations (ineluding Abi = 500). 

Fig.|^depiets the endmembers estimated by N-FindR and RBLU. Fig.j^eompares the abun- 
danee maps provided by RBLU to those obtained with N-FindR followed by FCLS. These 
figures show that the endmembers and abundanee maps are all in good agreement. In addition 
to the endmember and abundanee estimates, RBLU provides speetral and spatial information 
about the possible outliers/nonlinearities. Fig. (left) shows average speetral outlier energy 
over wavelength for the Moffett subimage. In addition to signifieant outlier levels near the 
water absorption bands, around 1400nm and ISOOnm, RBLU identifies important deviations 
from the linear mixing model for the speetral bands between 400nm and 800nm. Fig. 
(right) displays the estimated outlier energy (i.e., ||r„|| 2 ) map over all pixels in the Moffett 
seene and shows that the deviations from the linear mixing model are mainly loeated in the 
eoastal area, whieh is in agreement with the results obtained in 0, [ |28| . Fig. |7] eompares 
the estimated spectrum of the outliers in the pixel (40,31) (loeated in the eoastal area) to the 
estimated endmembers. Although RBLU promotes groups of outliers, it does not explieitly 
enforee spatial nor speetral dependeneies for the outlier values. However, the outliers in this 
pixel seem to be speetrally eorrelated. These results show that RBLU is able to distinguish 
struetured outliers from Gaussian noise. These results also show that the outlier spectrum and 
the vegetation speetral signature (in red) seem eorrelated, espeeially in the visible speetrum. 
A possible explanation for this eorrelation eould be loeal signifieant ehanges of vegetation, 
e.g., chlorophyll and water eontent, additional vegetation species, multiple seattering effects. 
Finally, the reeonstruetion errors obtained by N-FindR-i-FCLS and RBLU are eompared in 
Fig. Due to its outlier detection ability, RBLU provides lower reeonstruetion errors in the 
eoastal area, whieh is the region where outliers/anomalies are the most dominant. 
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Fig. 4. The i? = 3 endmembers extracted from the Moffett image by N-FindR (red) and RBLU (blue). 


B. Villelongue data set 

The second real image considered was acquired in 2010 in the Madonna project and col¬ 
lected by the Hyspex hyperspectral scanner over Villelongue, France (00°03'W and 42°57'N). 
L = 160 spectral bands were recorded from the visible to near infrared with a spatial 


resolution of 0.5m. This dataset has previously been studied in [10|, |311, [32| and is mainly 
composed of forested and urban areas. More details about the data acquisition and pre¬ 


processing steps can be found in [311. A sub-image of size 300 x 250 pixels is chosen here 
to evaluate the proposed unmixing procedure and is depicted in Fig. The scene is composed 
mainly of trees and grass, resulting in i? = 4 endmembers (soil, grass, trees and shade). 

Fig. [T^ compares the i? = 4 endmembers estimated by N-FindR and RBLU for this second 
real image. Although it is difficult to objectively assess the performance of the two EEAs for 
this image, it is interesting to note that the results obtained by the two methods are similar 
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Vegetation 




Fig. 5. The R — 3 abundance maps associated with the Moffett image and estimated by FCLS (top) and RBLU (bottom). 
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Fig. 6. Left: Average spectral energy of the outliers in the Moffett scene estimated by RBLU. Right: Estimated outlier 
energy |lf „||2 in each pixel of the Moffett image. 


for the tree, soil and grass speetra. The shade signature identified by RBLU has a lower 
amplitude than the speetrum estimated by N-FindR, whieh is probably due to the absenee of 
eompletely shadowed pixels in the image (as diseussed above, RBLU does not rely on the 
pure pixel assumption, in eontrast to N-FindR). The two methods however lead to abundanee 
maps (depleted in Fig. that are in agreement with the true eolor image in Fig. In 
partieular, the two algorithms are able to identify the path (soil) in the seene. This path is 
barely visible in Fig. but its presenee ean be eonfirmed using the Google Map image for 
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Fig. 7. Endmembers and outlier signature of the pixel (40,31) estimated by RBLU for the Moffett image. 



Fig. 8. Reconstruction errors of the Moffett image using N-FindR+FCLS (left) and RBLU. 


this region (see Fig. 12). 


Fig. 13 (left) depicts the anomaly energy map estimated by RBLU over the Villelongue 
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Fig. 9. True color image of the Villelongue area (left) and sub-image of interest (right). 



Tree Grass 




Wavelength (nm) Wavelength (nm) 


Fig. 10. The 7? = 4 endmembers extracted from the Villelongue image by N-FindR (red) and RBLU (blue). 
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image and highlights two main regions where signifieant deviations from the linear mixing 
model oeeur. The first region, on the top of Fig. (left) is loeated where trees are identified 
and the deviations are likely to be due to the presenee of different tree speeies. Note that 
this region ean also be identified in Fig. (right) (lighter green region). The seeond region 
representing a line in the eentre of Fig. (left) is more diffieult to interpret and is barely 
visible in the true eolor image. However, it is interesting to note that the anomalies in this 
region form a stripe whose energy is higher in the pixels eomposed of grass than it is in 
those eontaining trees. Consequently, it is reasonable to postulate that the physieal phenomena 
eausing deviations from the linear mixing model occur on the ground or below the surface, 
i.e., not in the canopy. One possible cause of these spectral signature changes could be 
the presence of different kinds of surface vegetation. Finally, Fig. [T^ presents examples of 


anomaly spectra of the first (green box in Fig. 13) and second (red box in Fig. 13) spatial 
regions. This figure shows that the deviations from the linear mixing model occur in specific 
spectral bands and are relatively similar within each spatial region. Moreover, the anomalies 
are spectrally different in the two regions, which confirms they are probably due to different 
physical phenomena. 


Finally, Table VI compares the Matlab execution time required to analyse the Moffett and 
Villelongue data using RBLU and N-FindR+FCLS. 



Moffett 

Villelongue 

RBLU 

1590 

18360 

N-FindR+FCLS 

3 

31 


TABLE VI 

Computational time (in seconds) to analyse the real images using RBLU and N-FindR+FCLS. 


VIII. Conclusion 

In this paper, we have proposed a Bayesian algorithm for robust linear spectral unmixing 
of hyperspectral images that performs joint endmember estimation, abundance estimation and 
outlier/anomaly detection. Appropriate prior distributions were used to enforce endmember 
and abundance positivity in addition to abundance sum-to-one constraints. Moreover, a 3D 
spatial-spectral Ising Markov random field was used to model correlations between outliers. 
Finally, an adaptive MCMC method was proposed to sample from the resulting posterior 
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distribution in order to estimate the unknown model parameters. Simulations conducted on 
synthetic data showed superior performance of the proposed method for linear SU and the 
detection of outliers in hyperspectral images. The proposed method was also applied to real 
hyperspectral images and provided interesting results in terms of outlier analysis. Future work 
might include generalization of the proposed method for non-Gaussian observation noise. 
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Fig. 11. The i? = 4 abundance maps associated with the Villelongue image and estimated by FCLS (right) and RBLU 
(left). 
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Fig. 12. Soil abundance map estimated by RBLU for the Villelongue image (left) and corresponding Google Map image 
(right) highlighting the presence of a path in the region of interest. 
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Fig. 13. Left: Anomaly energy map estimated by RBLU for the Villelongue scene. Right: sub-image of interest in true 
colors. 
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Fig. 14. Anomaly spectra in the green (top) and red (bottom) boxes depicted in Fig|13| 
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